Skip to content

Add cut-point rules for specific functions / intrinsics (via definition))#960

Merged
dkcumming merged 20 commits intomasterfrom
dc/filter-function-cut-point-definiton-version
Mar 3, 2026
Merged

Add cut-point rules for specific functions / intrinsics (via definition))#960
dkcumming merged 20 commits intomasterfrom
dc/filter-function-cut-point-definiton-version

Conversation

@dkcumming
Copy link
Collaborator

@dkcumming dkcumming commented Feb 28, 2026

This PR builds upon #931 modifying the approach in response to the comments on that PR. For full context read #931 first.

The kmir prove-rs flag --break-on-function is implemented in this PR as a compiled definition with hooked function to retrieve the function names to match on. This is similar to the already existing pattern that compiles the static data of a KMIR configuration into the definition. This allows for functions to be provided both when creating the initial proof, and when reading from disc (triggers a recompile of llvm if different flags are provided).

I added a test to demonstrate this working on functions and intrinsics, only matching those provided. I do not have a test from reading a partial proof and adding different function names - I did test it but it seemed a bit overboard for a test just now.

I did try the method with K shell access impure function, however this created branching for every function call since the result was stored in a symbolic value. I couldn't figure out how to get that working concretely (I don't think it is possible but might be wrong).

The `breakOnFunctions` cell will contain identifiers of functions that
should be broken on by the `termCallFunctionFilter` rule. Which is
identical to `termCallFunction` except it checks if the id is in the
`breakOnFunctions` cell and has a different identifier for the cut point
rules.
Fully qualified paths may have commas. Instead of using a separator use
multiple occurances of the flag
2 intrinsics and 2 functions are added, 1 of each is broken on only and
should appear in the output
Since the config does not change now there is no value in changing the
test files as it now only would change the non-deterministic tys
Copy link
Contributor

@Stevengre Stevengre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing this feature. But it looks like the changes in kast.py, test_decode_value.py and iterator-enumerate-fail.rs are unrelated files in this PR. Additionally, the filter may cost more time for execution. Have you tested the influence of it? I don't know if it's worth to introduce such a complexity and fallback for this feature. Maye be we should leave a debug branch just for these features?

@dkcumming
Copy link
Collaborator Author

Yes kast.py and test_decode_value.py I didn't revert back properly, I will fix that now. And iterator-enumerate-fail.rs was a different test I didn't meant to commit. I will fix that now

I will run with and without this for a handful of tests and see the timing difference. We could leave it on a debug branch, however from what @palinatolmach suggested this is already functionality that exists in Kontrol (although implemented differently), I feel it would be better to provide this to ourselves and any other potential users without the friction of accessing it through a debug branch. Of course if there is a significant timing regression we have to weigh that tradeoff, so I will get that information and post it here.

@dkcumming
Copy link
Collaborator Author

dkcumming commented Mar 2, 2026

Integration Tests MIR Semantics

I didn't filter the tests properly so I ran 1 extra one on this branch, but even with the extra test the times are pretty much the same.

daniel@daniel-MS-7E06 mir-semantics$ git rev-parse @
e23e477a3b53d1a6c4615a966ecf6f38ae883fb3
daniel@daniel-MS-7E06 mir-semantics$ time make test-integration
//-snip-
============================= 321 passed, 7 skipped in 3007.41s (0:50:07) ==============================

real    50m10.432s
user    284m3.711s
sys     8m32.724s
daniel@daniel-MS-7E06 mir-semantics-filter$ git rev-parse @
bffe2d185653f23e5dea1733d32908900c405fb6
daniel@daniel-MS-7E06 mir-semantics-filter$ time make test-integration TEST_ARGS='-k "not test_cli_break_on_function"'
============================= 322 passed, 7 skipped in 3043.85s (0:50:43) ==============================

real    50m44.602s
user    286m15.304s
sys     8m33.420s

P-Token Tests

master this PR
Approve single-thread 277 290
Revoke single-thread 214 225
MintTo single-thread 1290 1260
Burn --max-workers 4 2694 2695

Copy link
Member

@jberthold jberthold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall.
I agree this should be tested and not merged if it causes a major performance regression. Would be surprised if it was causing issues, though, as the function evaluation should be fast, probably faster than getting an environment variable.

Comment on lines +380 to +421
syntax String ::= getFunctionName(MonoItemKind) [function, total]
//---------------------------------------------------------------
rule getFunctionName(monoItemFn(symbol(NAME), _, _)) => NAME
rule getFunctionName(monoItemStatic(symbol(NAME), _, _)) => NAME
rule getFunctionName(monoItemGlobalAsm(_)) => ""
rule getFunctionName(IntrinsicFunction(symbol(NAME))) => NAME

// Check whether a function name matches any filter in the break-on-functions list.
syntax Bool ::= #functionNameMatchesEnv(String) [function, total]
//----------------------------------------------------------------
rule #functionNameMatchesEnv(NAME) => #functionNameMatchesEnvStr(NAME, #breakOnFunctionsString(0))

// The Int argument is unused; it exists only so the Haskell backend can
// pattern-match on it and not error since zero-argument functions cannot use [owise].
syntax String ::= #breakOnFunctionsString(Int) [function, total, symbol(breakOnFunctionsString)]
//-----------------------------------------------------------------------------------------------
rule #breakOnFunctionsString(_) => "" [owise] // This gets overridden by corresponding python function

syntax Bool ::= #functionNameMatchesEnvStr(String, String) [function, total]
//--------------------------------------------------------------------------
rule #functionNameMatchesEnvStr(_, "") => false
rule #functionNameMatchesEnvStr(NAME, ENV) => #functionNameMatchesAnyList(NAME, #splitSemicolon(ENV))
requires ENV =/=String ""

syntax List ::= #splitSemicolon(String) [function, total]
//--------------------------------------------------------
rule #splitSemicolon(S) => #splitSemicolonAux(S, findString(S, ";", 0))

syntax List ::= #splitSemicolonAux(String, Int) [function, total]
//-----------------------------------------------------------------
rule #splitSemicolonAux(S, -1) => ListItem(S)
rule #splitSemicolonAux(S, I) =>
ListItem(substrString(S, 0, I)) #splitSemicolon(substrString(S, I +Int 1, lengthString(S)))
requires I >=Int 0

syntax Bool ::= #functionNameMatchesAnyList(String, List) [function, total]
//-------------------------------------------------------------------------
rule #functionNameMatchesAnyList(_, .List) => false
rule #functionNameMatchesAnyList(NAME, ListItem(FILTER:String) REST) =>
0 <=Int findString(NAME, FILTER, 0) orBool #functionNameMatchesAnyList(NAME, REST)
rule #functionNameMatchesAnyList(_, _) => false [owise]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These all seem like pure functions that could be moved to a helper module to keep this file cleaner/smaller. Is there a utils.md file we could move it to in order to keep this file cleaner?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have a utils.md but we definitely need one as we have many things scattered. I think that will be something done in another PR as it will be more comprehensive, I will co-ordinate with @Stevengre @mariaKt on this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do need some refactor to make the semantics easier to access with better structure. Should we write a proposal about how to manage and how to migrate?

@ehildenb
Copy link
Member

ehildenb commented Mar 2, 2026

Does it automatically detect if the function break arguments have not changed, and if so not rekompile? Or does the user have to manually rekompile with the desired flags?

@dkcumming
Copy link
Collaborator Author

If you use the normal uv/pyk interface prove-rs it will only recompile when needed and automatically

@dkcumming dkcumming merged commit 2c84b7f into master Mar 3, 2026
7 checks passed
@dkcumming dkcumming deleted the dc/filter-function-cut-point-definiton-version branch March 3, 2026 02:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants